In this hands-on exercise, you will learn how to handle geospatial data in R by using appropriate R packages.
By the end of this hands-on exercise, you should acquire the following competencies:
Before you start, you are required to extract the necessary data sets from their respective sources.
Before we get started, it is important for us to ensure that all the R packages we need have been installed.
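The code chunk below installs any missing package and then loads it:
# Note: rgdal and rgeos were retired from CRAN in October 2023,
# so install.packages() may fail for them on a current R setup.
packages <- c('sp', 'rgdal', 'rgeos', 'sf', 'tidyverse')
for (p in packages) {
  # install the package only if it cannot be loaded
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}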
In this hands-on exercise, you are required to import the following geospatial data into R:
In this section, you will learn how to import geospatial data in ESRI shapefile and Google’s KML formats into R as simple feature data.frames.
Before getting started, you are encouraged to read the sf vignette "2. Reading, Writing and Converting Simple Features".
The code chunk below uses the st_read() function of the sf package to import the MP14_SUBZONE_WEB_PL layer into R as a simple feature data.frame.
sf_mpsz <- st_read(dsn = "data/geospatial",
                   layer = "MP14_SUBZONE_WEB_PL")
## Reading layer `MP14_SUBZONE_WEB_PL' from data source `D:\tskam\GeoDSA\Hands-on_Ex\Hands-on_Ex02\data\geospatial' using driver `ESRI Shapefile'
## Simple feature collection with 323 features and 15 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
## Projected CRS: SVY21
The code chunk below uses the st_read() function of the sf package to import the CyclingPath layer into R as a simple feature data.frame.
sf_cyclingpath <- st_read(dsn = "data/geospatial",
                          layer = "CyclingPath")
## Reading layer `CyclingPath' from data source `D:\tskam\GeoDSA\Hands-on_Ex\Hands-on_Ex02\data\geospatial' using driver `ESRI Shapefile'
## Simple feature collection with 1625 features and 2 fields
## Geometry type: LINESTRING
## Dimension: XY
## Bounding box: xmin: 12711.19 ymin: 28711.33 xmax: 42626.09 ymax: 48948.15
## Projected CRS: SVY21
The code chunk below uses the st_read() function of the sf package to import the pre-schools-location-kml file into R as a simple feature data.frame. Because a KML file is a single data source, only the file path is needed.
sf_preschool = st_read("data/geospatial/pre-schools-location-kml.kml")
## Reading layer `PRESCHOOLS_LOCATION' from data source `D:\tskam\GeoDSA\Hands-on_Ex\Hands-on_Ex02\data\geospatial\pre-schools-location-kml.kml' using driver `KML'
## Simple feature collection with 1359 features and 2 fields
## Geometry type: POINT
## Dimension: XYZ
## Bounding box: xmin: 103.6824 ymin: 1.248403 xmax: 103.9897 ymax: 1.462134
## z_range: zmin: 0 zmax: 0
## Geodetic CRS: WGS 84
Notice that the sf_preschool simple feature data.frame is in the WGS 84 geographic coordinate system. In Section 2.3, you will learn how to transform it into the SVY21 projected coordinate system.
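As a self-contained illustration of the dsn and layer arguments, the sketch below writes a tiny point layer to a temporary folder as a shapefile and reads it back with st_read(). The object names (demo_pts, demo_back), the layer name demo_points and the coordinates are made up for this example; they are not part of the exercise data.

```r
library(sf)

# Two made-up points in WGS 84 (longitude, latitude); not exercise data
demo_pts <- st_sf(
  id = 1:2,
  geometry = st_sfc(st_point(c(103.80, 1.35)),
                    st_point(c(103.90, 1.30)),
                    crs = 4326)
)

# Write the layer as a shapefile inside a temporary folder
demo_dir <- tempfile("demo_geospatial")
dir.create(demo_dir)
st_write(demo_pts, file.path(demo_dir, "demo_points.shp"), quiet = TRUE)

# For shapefiles, dsn is the folder and layer is the file name without ".shp"
demo_back <- st_read(dsn = demo_dir, layer = "demo_points", quiet = TRUE)

nrow(demo_back)                                   # number of features read
st_geometry_type(demo_back, by_geometry = FALSE)  # geometry type of the layer
```

Setting quiet = TRUE suppresses the "Reading layer ..." report shown in the outputs above; leave it out if you want to see the driver, feature count and CRS.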
Next, let us examine the structure of the newly created simple feature data.frame. There are at least two ways to do so.
First, you can view the structure in the Environment pane of RStudio, which is the handiest way. Alternatively, glimpse() (loaded with tidyverse) can be used to display the structure of the newly created simple feature data.frame.
glimpse(sf_mpsz)
## Rows: 323
## Columns: 16
## $ OBJECTID <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, ~
## $ SUBZONE_NO <int> 1, 1, 3, 8, 3, 7, 9, 2, 13, 7, 12, 6, 1, 5, 1, 1, 3, 2, 2, ~
## $ SUBZONE_N <chr> "MARINA SOUTH", "PEARL'S HILL", "BOAT QUAY", "HENDERSON HIL~
## $ SUBZONE_C <chr> "MSSZ01", "OTSZ01", "SRSZ03", "BMSZ08", "BMSZ03", "BMSZ07",~
## $ CA_IND <chr> "Y", "Y", "Y", "N", "N", "N", "N", "Y", "N", "N", "N", "N",~
## $ PLN_AREA_N <chr> "MARINA SOUTH", "OUTRAM", "SINGAPORE RIVER", "BUKIT MERAH",~
## $ PLN_AREA_C <chr> "MS", "OT", "SR", "BM", "BM", "BM", "BM", "SR", "QT", "QT",~
## $ REGION_N <chr> "CENTRAL REGION", "CENTRAL REGION", "CENTRAL REGION", "CENT~
## $ REGION_C <chr> "CR", "CR", "CR", "CR", "CR", "CR", "CR", "CR", "CR", "CR",~
## $ INC_CRC <chr> "5ED7EB253F99252E", "8C7149B9EB32EEFC", "C35FEFF02B13E0E5",~
## $ FMEL_UPD_D <date> 2014-12-05, 2014-12-05, 2014-12-05, 2014-12-05, 2014-12-05~
## $ X_ADDR <dbl> 31595.84, 28679.06, 29654.96, 26782.83, 26201.96, 25358.82,~
## $ Y_ADDR <dbl> 29220.19, 29782.05, 29974.66, 29933.77, 30005.70, 29991.38,~
## $ SHAPE_Leng <dbl> 5267.381, 3506.107, 1740.926, 3313.625, 2825.594, 4428.913,~
## $ SHAPE_Area <dbl> 1630379.27, 559816.25, 160807.50, 595428.89, 387429.44, 103~
## $ geometry <MULTIPOLYGON [m]> MULTIPOLYGON (((31495.56 30..., MULTIPOLYGON (~
Notice the last column, geometry. It is the simple feature list-column, an object of class sfc that stores the geometry of each feature (refer to the Topic 2 slides for more discussion). The column is often, though not always, called geometry.
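The sketch below shows one way to inspect the geometry list-column directly. It builds a made-up one-feature layer (the object demo and its column name geom are assumptions for illustration) so that it runs without the exercise data.

```r
library(sf)

# A made-up one-feature layer whose geometry column is deliberately named "geom"
demo <- st_sf(
  name = "demo feature",
  geom = st_sfc(st_point(c(28001.642, 38744.572)), crs = 3414)
)

attr(demo, "sf_column")     # name of the active geometry column
class(st_geometry(demo))    # the list-column is an sfc object
st_drop_geometry(demo)      # plain data.frame without the geometry
```

The same three calls can be applied to sf_mpsz; st_geometry() is usually safer than demo$geometry because it does not depend on the column name.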
You can also check the contents of sf_mpsz data.frame by using summary().
The code chunk:
summary(sf_mpsz)
## OBJECTID SUBZONE_NO SUBZONE_N SUBZONE_C
## Min. : 1.0 Min. : 1.000 Length:323 Length:323
## 1st Qu.: 81.5 1st Qu.: 2.000 Class :character Class :character
## Median :162.0 Median : 4.000 Mode :character Mode :character
## Mean :162.0 Mean : 4.625
## 3rd Qu.:242.5 3rd Qu.: 6.500
## Max. :323.0 Max. :17.000
## CA_IND PLN_AREA_N PLN_AREA_C REGION_N
## Length:323 Length:323 Length:323 Length:323
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## REGION_C INC_CRC FMEL_UPD_D X_ADDR
## Length:323 Length:323 Min. :2014-12-05 Min. : 5093
## Class :character Class :character 1st Qu.:2014-12-05 1st Qu.:21864
## Mode :character Mode :character Median :2014-12-05 Median :28465
## Mean :2014-12-05 Mean :27257
## 3rd Qu.:2014-12-05 3rd Qu.:31674
## Max. :2014-12-05 Max. :50425
## Y_ADDR SHAPE_Leng SHAPE_Area geometry
## Min. :19579 Min. : 871.5 Min. : 39438 MULTIPOLYGON :323
## 1st Qu.:31776 1st Qu.: 3709.6 1st Qu.: 628261 epsg:NA : 0
## Median :35113 Median : 5211.9 Median : 1229894 +proj=tmer...: 0
## Mean :36106 Mean : 6524.4 Mean : 2420882
## 3rd Qu.:39869 3rd Qu.: 6942.6 3rd Qu.: 2106483
## Max. :49553 Max. :68083.9 Max. :69748299
Lastly, head() can be used to list the first few records of the data.frame, as shown in the code chunk below.
head(sf_mpsz, n=4)
## Simple feature collection with 4 features and 15 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 26403.48 ymin: 28369.47 xmax: 32362.39 ymax: 30396.46
## Projected CRS: SVY21
## OBJECTID SUBZONE_NO SUBZONE_N SUBZONE_C CA_IND PLN_AREA_N
## 1 1 1 MARINA SOUTH MSSZ01 Y MARINA SOUTH
## 2 2 1 PEARL'S HILL OTSZ01 Y OUTRAM
## 3 3 3 BOAT QUAY SRSZ03 Y SINGAPORE RIVER
## 4 4 8 HENDERSON HILL BMSZ08 N BUKIT MERAH
## PLN_AREA_C REGION_N REGION_C INC_CRC FMEL_UPD_D X_ADDR
## 1 MS CENTRAL REGION CR 5ED7EB253F99252E 2014-12-05 31595.84
## 2 OT CENTRAL REGION CR 8C7149B9EB32EEFC 2014-12-05 28679.06
## 3 SR CENTRAL REGION CR C35FEFF02B13E0E5 2014-12-05 29654.96
## 4 BM CENTRAL REGION CR 3775D82C5DDBEFBD 2014-12-05 26782.83
## Y_ADDR SHAPE_Leng SHAPE_Area geometry
## 1 29220.19 5267.381 1630379.3 MULTIPOLYGON (((31495.56 30...
## 2 29782.05 3506.107 559816.2 MULTIPOLYGON (((29092.28 30...
## 3 29974.66 1740.926 160807.5 MULTIPOLYGON (((29932.33 29...
## 4 29933.77 3313.625 595428.9 MULTIPOLYGON (((27131.28 30...
In this section, you will learn how to work with map projections by using appropriate functions of the sf package.
In this section, you will learn how to assign an EPSG code to the sf_mpsz simple feature data.frame.
First, check the projection of sf_mpsz by using st_crs(), as shown in the code chunk below.
st_crs(sf_mpsz)
## Coordinate Reference System:
## User input: SVY21
## wkt:
## PROJCRS["SVY21",
## BASEGEOGCRS["SVY21[WGS84]",
## DATUM["World Geodetic System 1984",
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]],
## ID["EPSG",6326]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["Degree",0.0174532925199433]]],
## CONVERSION["unnamed",
## METHOD["Transverse Mercator",
## ID["EPSG",9807]],
## PARAMETER["Latitude of natural origin",1.36666666666667,
## ANGLEUNIT["Degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",103.833333333333,
## ANGLEUNIT["Degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",1,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",28001.642,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",38744.572,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["(E)",east,
## ORDER[1],
## LENGTHUNIT["metre",1,
## ID["EPSG",9001]]],
## AXIS["(N)",north,
## ORDER[2],
## LENGTHUNIT["metre",1,
## ID["EPSG",9001]]]]
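Next, assign EPSG 3414 to the sf_mpsz simple feature data.frame by using st_set_crs(). Note that st_set_crs() only replaces the CRS definition attached to the data; it does not change the coordinates.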
sf_mpsz3414 <- st_set_crs(sf_mpsz, 3414)
Let’s check the CRS again.
st_crs(sf_mpsz3414)
## Coordinate Reference System:
## User input: EPSG:3414
## wkt:
## PROJCRS["SVY21 / Singapore TM",
## BASEGEOGCRS["SVY21",
## DATUM["SVY21",
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4757]],
## CONVERSION["Singapore Transverse Mercator",
## METHOD["Transverse Mercator",
## ID["EPSG",9807]],
## PARAMETER["Latitude of natural origin",1.36666666666667,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",103.833333333333,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",1,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",28001.642,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",38744.572,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["northing (N)",north,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["easting (E)",east,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["Cadastre, engineering survey, topographic mapping."],
## AREA["Singapore - onshore and offshore."],
## BBOX[1.13,103.59,1.47,104.07]],
## ID["EPSG",3414]]
Notice that the sf_mpsz3414 simple feature data.frame is now in EPSG:3414.
In Section 2.1.3, we saw that the sf_preschool simple feature data.frame is in the WGS 84 geographic coordinate system.
st_crs(sf_preschool)
## Coordinate Reference System:
## User input: WGS 84
## wkt:
## GEOGCRS["WGS 84",
## DATUM["World Geodetic System 1984",
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## CS[ellipsoidal,2],
## AXIS["geodetic latitude (Lat)",north,
## ORDER[1],
## ANGLEUNIT["degree",0.0174532925199433]],
## AXIS["geodetic longitude (Lon)",east,
## ORDER[2],
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4326]]
Next, we will transform the sf_preschool simple feature data.frame to the SVY21 projected coordinate system (i.e. EPSG 3414) by using st_transform().
sf_preschool3414 <- st_transform(sf_preschool, 3414)
st_crs(sf_preschool3414)
## Coordinate Reference System:
## User input: EPSG:3414
## wkt:
## PROJCRS["SVY21 / Singapore TM",
## BASEGEOGCRS["SVY21",
## DATUM["SVY21",
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4757]],
## CONVERSION["Singapore Transverse Mercator",
## METHOD["Transverse Mercator",
## ID["EPSG",9807]],
## PARAMETER["Latitude of natural origin",1.36666666666667,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",103.833333333333,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["Scale factor at natural origin",1,
## SCALEUNIT["unity",1],
## ID["EPSG",8805]],
## PARAMETER["False easting",28001.642,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",38744.572,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["northing (N)",north,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["easting (E)",east,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["Cadastre, engineering survey, topographic mapping."],
## AREA["Singapore - onshore and offshore."],
## BBOX[1.13,103.59,1.47,104.07]],
## ID["EPSG",3414]]
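The difference between assigning a CRS and transforming to a CRS can be checked on a single point. The sketch below is a minimal illustration, assuming a made-up point placed at the natural origin of SVY21 (the latitude and longitude listed in the WKT above); the object names are hypothetical.

```r
library(sf)

# A made-up point at the natural origin of SVY21, given as longitude/latitude
pt <- st_sfc(st_point(c(103.833333333333, 1.36666666666667)))

# st_set_crs() only attaches a CRS label; the coordinates are untouched
pt_wgs84 <- st_set_crs(pt, 4326)
st_coordinates(pt_wgs84)

# st_transform() reprojects, so the coordinates become SVY21 eastings/northings
pt_svy21 <- st_transform(pt_wgs84, 3414)
st_coordinates(pt_svy21)
```

The transformed point should land close to the false easting and false northing (28001.642, 38744.572) shown in the WKT above, whereas the point passed through st_set_crs() keeps its original numbers.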
In this section, you will learn how to import aspatial data (i.e. the Singapore Airbnb listings.csv file) into R as a tibble data.frame, and then convert the tibble data.frame into a simple feature data.frame by using its x-coordinate and y-coordinate columns.
In the code chunk below, read_csv() of the readr package is used to parse listings.csv into R as a tibble data.frame.
listings <- read_csv("data/aspatial/listings.csv")
##
## -- Column specification --------------------------------------------------------
## cols(
## id = col_double(),
## name = col_character(),
## host_id = col_double(),
## host_name = col_character(),
## neighbourhood_group = col_character(),
## neighbourhood = col_character(),
## latitude = col_double(),
## longitude = col_double(),
## room_type = col_character(),
## price = col_double(),
## minimum_nights = col_double(),
## number_of_reviews = col_double(),
## last_review = col_date(format = ""),
## reviews_per_month = col_double(),
## calculated_host_listings_count = col_double(),
## availability_365 = col_double()
## )
The code chunk below converts the listings data frame into a simple feature data.frame by using st_as_sf() of the sf package.
Things to learn from the arguments:
listing_sf <- st_as_sf(listings,
coords = c("longitude", "latitude"),
crs= 4326)
glimpse(listing_sf)
## Rows: 4,388
## Columns: 15
## $ id <dbl> 49091, 50646, 56334, 71609, 71896, 7190~
## $ name <chr> "COZICOMFORT LONG TERM STAY ROOM 2", "P~
## $ host_id <dbl> 266763, 227796, 266763, 367042, 367042,~
## $ host_name <chr> "Francesca", "Sujatha", "Francesca", "B~
## $ neighbourhood_group <chr> "North Region", "Central Region", "Nort~
## $ neighbourhood <chr> "Woodlands", "Bukit Timah", "Woodlands"~
## $ room_type <chr> "Private room", "Private room", "Privat~
## $ price <dbl> 81, 80, 67, 177, 81, 81, 206, 52, 40, 7~
## $ minimum_nights <dbl> 180, 90, 6, 90, 90, 90, 1, 14, 14, 90, ~
## $ number_of_reviews <dbl> 1, 18, 20, 20, 24, 48, 29, 20, 13, 133,~
## $ last_review <date> 2013-10-21, 2014-12-26, 2015-10-01, 20~
## $ reviews_per_month <dbl> 0.01, 0.21, 0.17, 0.18, 0.20, 0.40, 0.2~
## $ calculated_host_listings_count <dbl> 2, 1, 2, 5, 5, 5, 5, 47, 47, 7, 1, 47, ~
## $ availability_365 <dbl> 365, 365, 365, 365, 1, 365, 181, 350, 0~
## $ geometry <POINT [°]> POINT (103.7958 1.44255), POINT (~
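To see what the coords and crs arguments do, the sketch below repeats the conversion on a two-row data frame. The rows are made up for illustration; only the column names mirror listings.csv.

```r
library(sf)

# Two made-up listings; only the column names mirror listings.csv
demo_listings <- data.frame(
  id = c(1, 2),
  latitude = c(1.44255, 1.33235),
  longitude = c(103.7958, 103.7852)
)

# coords takes the x column (longitude) first, then the y column (latitude)
demo_sf <- st_as_sf(demo_listings,
                    coords = c("longitude", "latitude"),
                    crs = 4326)

names(demo_sf)            # coordinate columns are replaced by geometry
st_coordinates(demo_sf)   # X holds longitude, Y holds latitude
```

This also explains why the glimpse() output above shows 15 columns rather than 16: the latitude and longitude columns are consumed and replaced by a single geometry column.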
Next, we will transform the listing simple feature data.frame from the WGS 84 geographic coordinate system to the SVY21 projected coordinate system by using st_transform().
listing_sf <- st_transform(listing_sf, 3414)
glimpse(listing_sf)
## Rows: 4,388
## Columns: 15
## $ id <dbl> 49091, 50646, 56334, 71609, 71896, 7190~
## $ name <chr> "COZICOMFORT LONG TERM STAY ROOM 2", "P~
## $ host_id <dbl> 266763, 227796, 266763, 367042, 367042,~
## $ host_name <chr> "Francesca", "Sujatha", "Francesca", "B~
## $ neighbourhood_group <chr> "North Region", "Central Region", "Nort~
## $ neighbourhood <chr> "Woodlands", "Bukit Timah", "Woodlands"~
## $ room_type <chr> "Private room", "Private room", "Privat~
## $ price <dbl> 81, 80, 67, 177, 81, 81, 206, 52, 40, 7~
## $ minimum_nights <dbl> 180, 90, 6, 90, 90, 90, 1, 14, 14, 90, ~
## $ number_of_reviews <dbl> 1, 18, 20, 20, 24, 48, 29, 20, 13, 133,~
## $ last_review <date> 2013-10-21, 2014-12-26, 2015-10-01, 20~
## $ reviews_per_month <dbl> 0.01, 0.21, 0.17, 0.18, 0.20, 0.40, 0.2~
## $ calculated_host_listings_count <dbl> 2, 1, 2, 5, 5, 5, 5, 47, 47, 7, 1, 47, ~
## $ availability_365 <dbl> 365, 365, 365, 365, 1, 365, 181, 350, 0~
## $ geometry <POINT [m]> POINT (23824.77 47135.4), POINT (~
To view the spatial data, plot() function of sf package can be used.
plot(sf_mpsz)
Although the simple feature data.frame is gaining popularity over sp’s Spatial* classes, many geospatial analysis packages still require the input geospatial data in sp’s Spatial* classes. In this section, you will learn how to convert a simple feature data.frame to sp’s Spatial* class.
The code chunk below uses as_Spatial() of the sf package to convert the sf_preschool3414 simple feature data.frame to sp’s Spatial* class.
sp_preschool <- as_Spatial(sf_preschool3414)
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO", prefer_proj
## = prefer_proj): Discarded datum Unknown based on WGS84 ellipsoid in Proj4
## definition
## Warning in showSRID(SRS_string, format = "PROJ", multiline = "NO", prefer_proj =
## prefer_proj): Discarded datum SVY21 in Proj4 definition
Notice that the output is a SpatialPointsDataFrame class.
You can check the content of the SpatialPointsDataFrame by using summary() as shown in the code chunk below.
summary(sp_preschool)
## Object of class SpatialPointsDataFrame
## Coordinates:
## min max
## coords.x1 11203.01 45404.24
## coords.x2 25667.60 49300.88
## coords.x3 0.00 0.00
## Is projected: TRUE
## proj4string :
## [+proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1
## +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +units=m +no_defs]
## Number of points: 1359
## Data attributes:
## Name Description
## Length:1359 Length:1359
## Class :character Class :character
## Mode :character Mode :character
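The conversion works in both directions. The sketch below is a minimal round trip on two made-up points (hypothetical names and coordinates), assuming the sp package is installed alongside sf.

```r
library(sf)
library(sp)  # provides the Spatial* classes

# Two made-up points in SVY21 (EPSG:3414)
demo_sf <- st_sf(
  name = c("a", "b"),
  geometry = st_sfc(st_point(c(23824.77, 47135.40)),
                    st_point(c(28001.64, 38744.57)),
                    crs = 3414)
)

demo_sp <- as_Spatial(demo_sf)   # sf -> sp
class(demo_sp)

demo_sf2 <- st_as_sf(demo_sp)    # sp -> sf
class(demo_sf2)
```

Converting back with st_as_sf() is handy when an analysis package returns a Spatial* object but the rest of the workflow uses sf.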
DIY: Using the steps you have learned, convert the sf_mpsz3414 and sf_mpsz simple feature data.frames to sp’s Spatial* classes. After the conversion, examine the output spatial classes carefully. Write short notes to describe your observations of the output spatial classes.
Besides providing functions to handle and wrangle geospatial data, the sf package also provides functions to perform geoprocessing tasks, like most GIS toolkits do.
In this section, you will learn how to perform two popular GIS geoprocessing tasks, namely buffering and point-in-polygon counting, by using the sf package.
The scenario:
The authority is planning to upgrade the existing cycling paths. To do so, they need to acquire 5 metres of reserve land on both sides of the current cycling paths. You are tasked to determine the extent of the land that needs to be acquired and its total area.
The solution:
Create 5-metre buffers around the cycling paths by using st_buffer(), then calculate the total area of the buffers by using st_area().
sf_buffer_cycling <- st_buffer(sf_cyclingpath,
                               dist = 5, nQuadSegs = 30)
sf_buffer_cycling$AREA <- st_area(sf_buffer_cycling)
sum(sf_buffer_cycling$AREA)
## 773143.9 [m^2]
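A quick sanity check of st_buffer() and st_area() can be done on geometry whose area is known in advance. The sketch below uses two made-up 100 m segments that cross each other (hypothetical data, not the CyclingPath layer). It also shows that summing per-feature areas counts overlapping land twice, which may matter when buffers of neighbouring path segments overlap.

```r
library(sf)

# Two made-up 100 m line segments that cross each other, in SVY21 (EPSG:3414)
demo_lines <- st_sf(
  id = 1:2,
  geometry = st_sfc(
    st_linestring(rbind(c(0, 50), c(100, 50))),
    st_linestring(rbind(c(50, 0), c(50, 100))),
    crs = 3414
  )
)

demo_buf <- st_buffer(demo_lines, dist = 5, nQuadSegs = 30)

# Per-feature area: roughly 2 * 5 * 100 for the strip
# plus pi * 5^2 for the two round end caps
st_area(demo_buf)

# Summing per-feature areas counts the overlap at the crossing twice;
# dissolving the buffers first gives the area actually covered
sum(st_area(demo_buf))
st_area(st_union(demo_buf))
```

Whether the per-feature sum or the dissolved area is the right answer depends on how the land acquisition is defined; the exercise above reports the per-feature sum.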
Because the output is a simple feature data.frame, you can plot the distribution of the buffer areas easily by using geom_histogram() of ggplot2.
ggplot(data = sf_buffer_cycling,
       aes(x = as.numeric(AREA))) +
  geom_histogram(bins = 30,
                 color = "black",
                 fill = "light blue")
The scenario:
A pre-school services group wants to find out the number of pre-schools in each Planning Subzone.
The solution:
The code chunk below first identifies the pre-schools located inside each Planning Subzone by using st_intersects(). Then, lengths() is used to count the pre-schools that fall inside each planning subzone.
sf_mpsz3414$`PreSch Count`<- lengths(st_intersects(sf_mpsz3414, sf_preschool3414))
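Warning: Do not confuse st_intersects() with st_intersection(). The former returns, for each subzone, the indices of the pre-schools that intersect it; the latter returns the intersected geometries themselves.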
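The point-in-polygon count can be traced by hand on a toy layout. The sketch below assumes two made-up square zones and four made-up points (the helper square() and all object names are hypothetical) and contrasts st_intersects() with st_intersection().

```r
library(sf)

# Hypothetical helper: a 10 m x 10 m square whose lower-left corner is (x0, 0)
square <- function(x0) {
  st_polygon(list(rbind(c(x0, 0), c(x0 + 10, 0), c(x0 + 10, 10),
                        c(x0, 10), c(x0, 0))))
}

demo_zones <- st_sf(zone = c("A", "B"),
                    geometry = st_sfc(square(0), square(20), crs = 3414))
demo_pts <- st_sfc(st_point(c(2, 2)), st_point(c(5, 5)),
                   st_point(c(25, 5)), st_point(c(50, 50)), crs = 3414)

# st_intersects(): for each zone, the indices of the points inside it
hits <- st_intersects(demo_zones, demo_pts)
hits
lengths(hits)   # number of points per zone

# st_intersection(): the geometries shared by points and zones
pts_in_zones <- st_intersection(demo_pts, st_geometry(demo_zones))
length(pts_in_zones)
```

Zone A contains two points, zone B contains one, and the fourth point lies outside both zones, so it contributes to neither count.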
You can check the summary statistics of the newly derived PreSch Count field by using summary() as shown in the code chunk below.
summary(sf_mpsz3414$`PreSch Count`)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 0.000 2.000 4.207 6.000 37.000
To list the planning subzone with the largest number of pre-schools, top_n() of the dplyr package is used, as shown in the code chunk below.
top_n(sf_mpsz3414, 1, `PreSch Count`)
## Simple feature collection with 1 feature and 16 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: 23449.05 ymin: 46001.23 xmax: 25594.22 ymax: 47996.47
## Projected CRS: SVY21 / Singapore TM
## OBJECTID SUBZONE_NO SUBZONE_N SUBZONE_C CA_IND PLN_AREA_N PLN_AREA_C
## 1 290 3 WOODLANDS EAST WDSZ03 N WOODLANDS WD
## REGION_N REGION_C INC_CRC FMEL_UPD_D X_ADDR Y_ADDR
## 1 NORTH REGION NR C90769E43EE6B0F2 2014-12-05 24506.64 46991.63
## SHAPE_Leng SHAPE_Area geometry PreSch Count
## 1 6603.608 2553464 MULTIPOLYGON (((24786.75 46... 37
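The dplyr documentation marks top_n() as superseded by slice_max(). The sketch below shows the newer form on a small made-up table (the subzone labels and counts are hypothetical), as an alternative rather than a required change.

```r
library(dplyr)

# Made-up counts; the subzone names and values are for illustration only
demo_counts <- tibble(
  subzone = c("A", "B", "C"),
  presch_count = c(3, 37, 12)
)

# Keep the row with the largest count (ties are kept by default)
top_zone <- slice_max(demo_counts, order_by = presch_count, n = 1)
top_zone
```

Unlike top_n(), slice_max() names the ordering column explicitly through order_by, which makes the intent easier to read.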
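Quiz: Calculate the density of pre-schools by planning subzone. With the help of an appropriate graphical method, describe the distribution of the newly derived variable.
The code chunk below uses st_area() of the sf package to derive the area of each planning subzone, then mutate() of dplyr to compute the density. Because st_area() returns square metres, multiplying by 1,000,000 expresses the density as pre-schools per square kilometre. The units label of the result still reads 1/m^2, hence the as.numeric() in the plots.
sf_mpsz3414$Area <- sf_mpsz3414 %>%
  st_area()
sf_mpsz3414 <- sf_mpsz3414 %>%
  mutate(`PreSch Density` = `PreSch Count` / Area * 1000000)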
ggplot(data=sf_mpsz3414,
aes(x= as.numeric(`PreSch Density`)))+
geom_histogram(bins=20,
color="black", fill="light blue")
ggplot(data=sf_mpsz3414,
aes(y = `PreSch Count`, x= as.numeric(`PreSch Density`)))+
geom_point(color="black", fill="light blue")
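The unit conversion behind the density calculation can be verified on a shape with a known area. The sketch below assumes a made-up 1,000 m by 2,000 m rectangle holding 4 hypothetical pre-schools, and uses set_units() of the units package (installed as a dependency of sf) as an explicit alternative to the manual factor of 1,000,000.

```r
library(sf)

# A made-up 1,000 m x 2,000 m rectangle in SVY21 with 4 hypothetical pre-schools
demo_zone <- st_sfc(
  st_polygon(list(rbind(c(0, 0), c(1000, 0), c(1000, 2000),
                        c(0, 2000), c(0, 0)))),
  crs = 3414
)
demo_count <- 4

area_m2  <- st_area(demo_zone)                                      # units: m^2
area_km2 <- units::set_units(area_m2, "km^2", mode = "standard")    # units: km^2

density_km2    <- demo_count / as.numeric(area_km2)        # per square kilometre
density_manual <- demo_count / as.numeric(area_m2) * 1e6   # same, via the manual factor

density_km2
density_manual
```

Both expressions describe 4 pre-schools on 2 square kilometres, so they should agree; converting the area first keeps the unit visible instead of hiding it in a constant.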
In this section, you will learn how to handle geospatial data in shapefile format using the sp, rgdal and rgeos packages in R.
In this section, you will learn how to import the MP14_SUBZONE_WEB_PL GIS layer into R. It is stored in shapefile format, and its features are polygons.
To import the GIS data layer into R, readOGR() from rgdal package will be used.
The data importing task is performed by using the code chunk below:
mpsz_sp <- readOGR(dsn = "data/geospatial",
layer = "MP14_SUBZONE_WEB_PL")
## OGR data source with driver: ESRI Shapefile
## Source: "D:\tskam\GeoDSA\Hands-on_Ex\Hands-on_Ex02\data\geospatial", layer: "MP14_SUBZONE_WEB_PL"
## with 323 features
## It has 15 fields
Notice that mpsz_sp is a SpatialPolygonsDataFrame.
You can check the contents of mpsz_sp data object by using summary().
The code chunk:
summary(mpsz_sp)
## Object of class SpatialPolygonsDataFrame
## Coordinates:
## min max
## x 2667.538 56396.44
## y 15748.721 50256.33
## Is projected: TRUE
## proj4string :
## [+proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1
## +x_0=28001.642 +y_0=38744.572 +datum=WGS84 +units=m +no_defs]
## Data attributes:
## OBJECTID SUBZONE_NO SUBZONE_N SUBZONE_C
## Min. : 1.0 Min. : 1.000 Length:323 Length:323
## 1st Qu.: 81.5 1st Qu.: 2.000 Class :character Class :character
## Median :162.0 Median : 4.000 Mode :character Mode :character
## Mean :162.0 Mean : 4.625
## 3rd Qu.:242.5 3rd Qu.: 6.500
## Max. :323.0 Max. :17.000
## CA_IND PLN_AREA_N PLN_AREA_C REGION_N
## Length:323 Length:323 Length:323 Length:323
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## REGION_C INC_CRC FMEL_UPD_D X_ADDR
## Length:323 Length:323 Length:323 Min. : 5093
## Class :character Class :character Class :character 1st Qu.:21864
## Mode :character Mode :character Mode :character Median :28465
## Mean :27257
## 3rd Qu.:31674
## Max. :50425
## Y_ADDR SHAPE_Leng SHAPE_Area
## Min. :19579 Min. : 871.5 Min. : 39438
## 1st Qu.:31776 1st Qu.: 3709.6 1st Qu.: 628261
## Median :35113 Median : 5211.9 Median : 1229894
## Mean :36106 Mean : 6524.4 Mean : 2420882
## 3rd Qu.:39869 3rd Qu.: 6942.6 3rd Qu.: 2106483
## Max. :49553 Max. :68083.9 Max. :69748299
Let’s view the first few records in the mpsz_sp.
The code chunk
head(mpsz_sp, n=4)
To view the spatial data, plot() of Base R can be used.
The code chunk:
plot(mpsz_sp)
Using the functions you had learned, import the Pre-School and Cycling Path GIS data files into R spatial objects.
The solution:
The pre-schools GIS data is in kml format. Before we can import the data file into R, we will use ogrListLayers function of rgdal package to check the actual data structure of the kml data file.
ogrListLayers("data/geospatial/pre-schools-location-kml.kml")
## [1] "PRESCHOOLS_LOCATION"
## attr(,"driver")
## [1] "KML"
## attr(,"nlayers")
## [1] 1
Notice that the file called pre-schools-location-kml is just the container (refer to the list above). In order to import the data, we need to use the PRESCHOOLS_LOCATION layer instead.
The code chunk below will do the trick.
preschool <- readOGR("data/geospatial/pre-schools-location-kml.kml",
"PRESCHOOLS_LOCATION")
## OGR data source with driver: KML
## Source: "D:\tskam\GeoDSA\Hands-on_Ex\Hands-on_Ex02\data\geospatial\pre-schools-location-kml.kml", layer: "PRESCHOOLS_LOCATION"
## with 1359 features
## It has 2 fields
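For readers without rgdal, st_layers() of the sf package offers a comparable way to list the layers in a data source. The sketch below is only an illustration under assumptions: it writes two made-up points to a temporary GeoPackage (used here instead of KML) under the hypothetical layer name DEMO_LOCATION, then lists and reads that layer.

```r
library(sf)

# Two made-up points in WGS 84; not the pre-school data
demo_pts <- st_sf(
  id = 1:2,
  geometry = st_sfc(st_point(c(103.80, 1.35)),
                    st_point(c(103.90, 1.30)),
                    crs = 4326)
)

# One file, one named layer inside it
demo_file <- tempfile(fileext = ".gpkg")
st_write(demo_pts, dsn = demo_file, layer = "DEMO_LOCATION", quiet = TRUE)

# List the layer names stored in the file, then read one of them by name
layer_names <- st_layers(demo_file)$name
layer_names
demo_back <- st_read(demo_file, layer = "DEMO_LOCATION", quiet = TRUE)
nrow(demo_back)
```

As with the KML file above, the file name and the layer name are two different things, so listing the layers first avoids guessing.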
In this section, you will learn how to import line geospatial data into R. The data is the CyclingPath shapefile from LTA DataMall (https://www.mytransport.sg/content/mytransport/home/dataMall.html).
cyclingpath <- readOGR (dsn = "data/geospatial",
layer = "CyclingPath")
## OGR data source with driver: ESRI Shapefile
## Source: "D:\tskam\GeoDSA\Hands-on_Ex\Hands-on_Ex02\data\geospatial", layer: "CyclingPath"
## with 1625 features
## It has 2 fields
mpsz_svy21 <- spTransform(mpsz_sp,
CRS("+init=epsg:3414"))
Now, it is your turn to change the projection system of the preschool data set from wgs84 to svy21.
The solution:
preschool_svy21 <- spTransform(preschool,
CRS("+init=epsg:3414"))
The scenario:
The authority is planning to upgrade the existing cycling paths. To do so, they need to acquire 5 metres of reserve land on both sides of the current cycling paths. You are tasked to determine the extent of the land that needs to be acquired and its total area.
The solution:
buf_cyclingpath <- gBuffer(cyclingpath, width = 5)
The solution:
buf_cyclingpath <- gBuffer(cyclingpath, byid = TRUE,
width = 5)
The solution:
buf_cyclingpath@data$Area <- gArea(buf_cyclingpath,
byid = TRUE)
sum(buf_cyclingpath@data$Area)
## [1] 771024.9
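The rgeos total (771024.9) is slightly lower than the sf total obtained earlier (773143.9). One plausible contributor is the circle approximation: gBuffer() defaults to quadsegs = 5, whereas the sf code chunk set nQuadSegs = 30. The sketch below illustrates the effect with sf on a single made-up 100 m line; it is an illustration of the mechanism, not a full reconciliation of the two totals.

```r
library(sf)

# A made-up 100 m line in SVY21 (EPSG:3414)
demo_line <- st_sfc(st_linestring(rbind(c(0, 0), c(100, 0))), crs = 3414)

# Same 5 m buffer, approximating each quarter circle with 5 or 30 segments
area_q5  <- st_area(st_buffer(demo_line, dist = 5, nQuadSegs = 5))
area_q30 <- st_area(st_buffer(demo_line, dist = 5, nQuadSegs = 30))

area_q5    # coarser round caps, slightly smaller area
area_q30   # finer round caps, closer to 2 * 5 * 100 + pi * 5^2
```

With more segments per quarter circle, the round end caps fill more of the true semicircle, so the buffer area grows slightly towards the theoretical value.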